Supervised PP-Attachment Disambiguation for Swedish (Combining Unsupervised & Supervised Training Data)

Author

  • Dimitrios Kokkinakis
Abstract

This paper applies machine learning techniques to the prepositional-phrase (PP) attachment ambiguity problem. Because machine learning requires large numbers of training instances, the paper also reports on acquiring such data through a mixture of unsupervised and restricted supervised methods. Training was performed both on a subset of the content of the Gothenburg Lexical Database (GLDB) and on a combination of instances from large corpora. Testing was performed using a range of different algorithms and metrics. The application language is written Swedish.

Similar resources

Combining Unsupervised and Supervised Methods for PP Attachment Disambiguation

Statistical methods for PP attachment fall into two classes according to the training material used: first, unsupervised methods trained on raw text corpora and second, supervised methods trained on manually disambiguated examples. Usually supervised methods win over unsupervised methods with regard to attachment accuracy. But what if only small sets of manually disambiguated material are avail...

The Effect of Corpus Size in Combining Supervised and Unsupervised Training for Disambiguation

We investigate the effect of corpus size in combining supervised and unsupervised learning for two types of attachment decisions: relative clause attachment and prepositional phrase attachment. The supervised component is Collins’ parser, trained on the Wall Street Journal. The unsupervised component gathers lexical statistics from an unannotated corpus of newswire text. We find that the combin...

Improving PP Attachment Disambiguation in a Rule-based Parser

This paper deals with how to enhance the performance of a rule-based parser using statistical information. PP (prepositional phrase) attachment ambiguity is one of the main ambiguities found in parsing. We therefore conducted some experiments on extracting statistical information for PP attachment from a corpus, and on applying such information to a rule-based parser. Two types of information a...

Corpus Based PP Attachment Ambiguity Resolution with a Semantic Dictionary

This paper deals with two important ambiguities of natural language: prepositional phrase attachment and word sense ambiguity. We propose a new supervised learning method for PP attachment based on a semantically tagged corpus. Because no sufficiently large sense-tagged corpus exists, we also propose a new unsupervised context-based word sense disambiguation algorithm which amends the tra...

Unsupervised Learning of Syntactic Knowledge: Methods and Measures

Supervised methods for ambiguity resolution learn in "sterile" environments, in the absence of syntactic noise. However, in many language engineering applications manually tagged corpora are neither available nor easily implemented. On the other hand, the "exportability" of disambiguation cues acquired from a given, noise-free domain (e.g. the Wall Street Journal) to other domains is not obvious. Unsu...

Publication date: 1999